Goto

Collaborating Authors

 higher-order interaction


A distributional simplicity bias in the learning dynamics of transformers

Neural Information Processing Systems

The remarkable capability of over-parameterised neural networks to generalise effectively has been explained by invoking a "simplicity bias": neural networks prevent overfitting by initially learning simple classifiers before progressing to




5a3674849d6d6d23ac088b9a2552f323-Paper-Conference.pdf

Neural Information Processing Systems

Previous works attempting to close this gap have failed to fully investigate the exponentially growing number of feature combinations which deep networks consider automatically during training. In this work, we develop a tractable selection algorithm to efficiently identify the necessary feature combinations byleveraging techniques infeature interaction detection. Our proposed Sparse Interaction AdditiveNetworks (SIAN) construct abridge from thesesimple andinterpretable models tofullyconnected neuralnetworks.


CAT-Walk: Inductive Hypergraph Learning via Set Walks

Neural Information Processing Systems

Temporal hypergraphs provide a powerful paradigm for modeling time-dependent, higher-order interactions in complex systems. Representation learning for hypergraphs is essential for extracting patterns of the higher-order interactions that are critically important in real-world problems in social network analysis, neuroscience, finance, etc. However, existing methods are typically designed only for specific tasks or static hypergraphs. We present CAT-Walk, an inductive method that learns the underlying dynamic laws that govern the temporal and structural processes underlying a temporal hypergraph. CAT-Walk introduces a temporal, higher-order walk on hypergraphs, SetWalk, that extracts higher-order causal patterns. CAT-Walk uses a novel adaptive and permutation invariant pooling strategy, SetMixer, along with a set-based anonymization process that hides the identity of hyperedges. Finally, we present a simple yet effective neural network model to encode hyperedges. Our evaluation on 10 hypergraph benchmark datasets shows that CAT-Walk attains outstanding performance on temporal hyperedge prediction benchmarks in both inductive and transductive settings. It also shows competitive performance with state-of-the-art methods for node classification.


Causal Discovery on Higher-Order Interactions

Zanga, Alessio, Scutari, Marco, Stella, Fabio

arXiv.org Machine Learning

Causal discovery combines data with knowledge provided by experts to learn the DAG representing the causal relationships between a given set of variables. When data are scarce, bagging is used to measure our confidence in an average DAG obtained by aggregating bootstrapped DAGs. However, the aggregation step has received little attention from the specialized literature: the average DAG is constructed using only the confidence in the individual edges of the bootstrapped DAGs, thus disregarding complex higher-order edge structures. In this paper, we introduce a novel theoretical framework based on higher-order structures and describe a new DAG aggregation algorithm. We perform a simulation study, discussing the advantages and limitations of the proposed approach. Our proposal is both computationally efficient and effective, outperforming state-of-the-art solutions, especially in low sample size regimes and under high dimensionality settings.


Collective decision-making with higher-order interactions on $d$-uniform hypergraphs

Njougouo, Thierry, Carletti, Timoteo, Tuci, Elio

arXiv.org Artificial Intelligence

Understanding how group interactions influence opinion dynamics is fundamental to the study of collective behavior. In this work, we propose and study a model of opinion dynamics on $d$-uniform hypergraphs, where individuals interact through group-based (higher-order) structures rather than simple pairwise connections. Each one of the two opinions $A$ and $B$ is characterized by a quality, $Q_A$ and $Q_B$, and agents update their opinions according to a general mechanism that takes into account the weighted fraction of agents supporting either opinion and the pooling error, $α$, a proxy for the information lost during the interaction. Through bifurcation analysis of the mean-field model, we identify two critical thresholds, $α_{\text{crit}}^{(1)}$ and $α_{\text{crit}}^{(2)}$, which delimit stability regimes for the consensus states. These analytical predictions are validated through extensive agent-based simulations on both random and scale-free hypergraphs. Moreover, the analytical framework demonstrates that the bifurcation structure and critical thresholds are independent of the underlying topology of the higher-order network, depending solely on the parameters $d$, i.e., the size of the interaction groups, and the quality ratio. Finally, we bring to the fore a nontrivial effect: the large sizes of the interaction groups, could drive the system toward the adoption of the worst option.


Higher-Order Causal Structure Learning with Additive Models

Enouen, James, Zheng, Yujia, Ng, Ignavier, Liu, Yan, Zhang, Kun

arXiv.org Machine Learning

Causal structure learning has long been the central task of inferring causal insights from data. Despite the abundance of real-world processes exhibiting higher-order mechanisms, however, an explicit treatment of interactions in causal discovery has received little attention. In this work, we focus on extending the causal additive model (CAM) to additive models with higher-order interactions. This second level of modularity we introduce to the structure learning problem is most easily represented by a directed acyclic hypergraph which extends the DAG. We introduce the necessary definitions and theoretical tools to handle the novel structure we introduce and then provide identifiability results for the hyper DAG, extending the typical Markov equivalence classes. We next provide insights into why learning the more complex hypergraph structure may actually lead to better empirical results. In particular, more restrictive assumptions like CAM correspond to easier-to-learn hyper DAGs and better finite sample complexity. We finally develop an extension of the greedy CAM algorithm which can handle the more complex hyper DAG search space and demonstrate its empirical usefulness in synthetic experiments.


A distributional simplicity bias in the learning dynamics of transformers

Neural Information Processing Systems

The remarkable capability of over-parameterised neural networks to generalise effectively has been explained by invoking a "simplicity bias": neural networks prevent overfitting by initially learning simple classifiers before progressing to